ABSTRACT

A strictly hierarchical message transfer scheme requires that a message follow a specified referral path until it is finally either rejected or filled at one of the information centers of the network. Thus at each node in the network three decisions can be made: satisfy, reject, or refer the message to the succeeding node in the hierarchy. Associating probabilities and costs with each of these decisions, we develop a Markovian model for the total network cost. The mean and variance of total cost are derived. Applicability of the model is discussed by considering the problems related to the estimation of necessary parameters. In particular, a queue theoretic model is developed for estimating response time for a message at an information center. (Author)

I. Introduction

In Nance, Korfhage and Bhat [1] an information network is defined as a sextuple

$$N = \{\mathcal{U}, \mathcal{I}, \mathcal{C}, A, f, f'\}$$

where the components of $N$ are defined as follows. The entities $\mathcal{U}$, $\mathcal{I}$ and $\mathcal{C}$ are the sets of nodes in the network representing the users, information resources and information centers respectively. We require that with each information center $c \in \mathcal{C}$ there be associated a non-empty set of users (a subset of $\mathcal{U}$), or a non-empty set of information resources (a subset of $\mathcal{I}$), or both. $A$ is the set of directed arcs on $\mathcal{U} \cup \mathcal{I} \cup \mathcal{C}$, where an arc $\langle v_i, v_j \rangle$ denotes that $v_j$ is directly accessible from $v_i$, and where each arc $\langle v_i, v_j \rangle$ joining nodes of $\mathcal{C}$ carries one or both of the labels

m - denoting possible message (request) transfer from $v_i$ to $v_j$, or
d - denoting possible document (response) transfer from $v_i$ to $v_j$.

Finally, $f$ and $f'$ are mathematical functions that define the information transfer structure of the network for the message and document transfers respectively.

Using the structural properties of $\mathcal{C}$, different types of networks are identified. One of these is the strictly hierarchical network, and the purpose of this paper is to study some of its operational characteristics.

II. Strictly Hierarchical Message Transfer Structure

Let

$$G = \{\text{arcs with label } m \text{ joining nodes in } \mathcal{C}\};$$

then, from [1] we have the following definition.

An information network $N$ is strictly hierarchical if the graph obtained by replacing all 2-cycles in $G$ by an undirected edge is an undirected tree (see [2] for graph theoretic terminology). In such a network any open path joining the root to a leaf is called a limb. The message transfer structure (m.t.s.) in a strictly hierarchical network is illustrated in the following figure.

[Figure: (a) Strictly hierarchical structure. (b) Replacement of all 2-cycles in $G$ by an undirected edge produces an undirected tree, whose root-to-leaf paths are the limbs.]

In other words, in a strictly hierarchical network a message follows a unique open path until it is finally either rejected or filled at one of the information centers. In view of this, the strictly hierarchical network has one of the most restrictive m.t.s., and it imposes a unique ordering on the referrals made in the network. Note that other m.t.s. also imply an ordering on the referrals, though not necessarily unique, as demonstrated below.
Consider a network with centers $C_1, C_2, \ldots, C_N$ and a message arriving at an arbitrary center $C_{i_0}$. If the message cannot be satisfied at $C_{i_0}$, it can be referred to one or more of the remaining centers for action. For operational effectiveness this referral should take into account the likelihood of satisfying the message at a center and the associated message transfer costs. Let $h_1, h_2, \ldots, h_N$ be the probabilities that a message is satisfied at centers $C_1, C_2, \ldots, C_N$ respectively. Also let $c_{ij}$ be the cost of referring a message from center $C_i$ to center $C_j$. This cost contains two components: transmission cost and processing cost. Any center involved in the message referral from $C_i$ to $C_j$ will cause a transmission cost. A processing cost for a center is incurred if the decision is to examine the message at that center before referral. For the present discussion we assume only transmission costs for the intermediate centers in the referral paths from $C_i$ to $C_j$.

Under the above assumptions consider a sequential order of centers for message referrals based on the following scheme.

Let

$$S_0 = \mathcal{C} - \{C_{i_0}\},$$

$$S_k = \mathcal{C} - \bigcup_{\ell=0}^{k} \{C_{i_\ell}\},$$

and let $C_{i_{k+1}} \in S_k$ be such that

$$\frac{c_{i_k i_{k+1}}}{h_{i_{k+1}}} = \min_{C_j \in S_k} \frac{c_{i_k j}}{h_j}.$$

This implies that the center $C_{i_{k+1}}$ in the referral path of a message originating at $C_{i_0}$ is chosen so as to minimize the cost/probability ratio $c_{i_k j}/h_j$ with regard to the information centers not covered by the message so far. By a slight generalization of [3], this scheme can be shown to result in the least expected cost for the entire message referral process.
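The ordering rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `referral_order`, the zero-based center indices, and the example cost and probability values are hypothetical and do not come from the paper, and every satisfaction probability is assumed to be strictly positive.

```python
def referral_order(start, cost, h):
    """Order in which centers are tried for a message arriving at `start`.

    cost[i][j] : referral (transmission) cost from center i to center j
    h[j]       : probability that the message is satisfied at center j (> 0)
    """
    remaining = set(range(len(h))) - {start}   # centers not yet covered
    order = [start]
    current = start
    while remaining:
        # Choose the uncovered center with the smallest cost/probability
        # ratio; the index j is used only to break ties deterministically.
        nxt = min(remaining, key=lambda j: (cost[current][j] / h[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return order


# Hypothetical four-center example.
cost = [[0, 4, 2, 6],
        [4, 0, 3, 1],
        [2, 3, 0, 5],
        [6, 1, 5, 0]]
h = [0.5, 0.4, 0.1, 0.8]
order = referral_order(0, cost, h)
print(order)
```

In this example the cheapest neighbor of center 0 (center 2) is tried last, because its low satisfaction probability makes its cost/probability ratio the largest.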



This, we feel, is a sufficient justification to consider the strictly hierarchical network for more extensive analysis. In view of the uncertainties involved in the m.t.s., a stochastic model for the referral scheme and the associated costs can be used to seek a better understanding of their economic implications. The effectiveness of the network structure and operation is reflected in the mean and variance of the total cost for the m.t.s. These characteristics also serve as useful criteria for the design of similar networks.

III. A Probabilistic Model for Network Cost



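Let $C_1, C_2, \ldots, C_L$ be the $L$ centers constituting a single limb in a strictly hierarchical network such that $C_1$ is the root and $C_L$ is the leaf. The message referral path is then given by $C_L \to C_{L-1} \to \cdots \to C_1$. Associated with each center are three outcomes with respect to each message: rejection, satisfaction or referral to the next center in the limb. For a given message currently at center $C_i$ let $p_{i,-1}$, $p_{i0}$ and $p_{i,i-1}$ respectively be the probabilities of these outcomes. (At the root $C_1$ there is no further center to refer to, so only rejection and satisfaction are possible.)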

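In constructing a probabilistic model for network cost we shall use the theory of finite Markov chains [4,5]. In order to model the message transfer process in a hierarchical network as a Markov chain, we identify the $L$ centers as $L$ states for the process and add two absorbing states $C_{-1}$ and $C_0$. These indicate the mode of ultimate disposition of the message. State $C_{-1}$ represents final rejection of the message and state $C_0$ a satisfactory disposition. At any time the message transfer process can be considered to be occupying any one of the $L+2$ states

$$C_{-1}, C_0, C_1, C_2, \ldots, C_L.$$

It should be noted that once a message reaches either of the states $C_{-1}$ and $C_0$, it remains there with probability 1. This is represented in the transition probability matrix given below.

$$
P \;=\;
\begin{array}{c|cc|ccccc}
 & -1 & 0 & 1 & 2 & \cdots & L-1 & L \\ \hline
-1 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ \hline
1 & p_{1,-1} & p_{10} & 0 & 0 & \cdots & 0 & 0 \\
2 & p_{2,-1} & p_{20} & p_{21} & 0 & \cdots & 0 & 0 \\
3 & p_{3,-1} & p_{30} & 0 & p_{32} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & & \ddots & & \vdots \\
L & p_{L,-1} & p_{L0} & 0 & 0 & \cdots & p_{L,L-1} & 0
\end{array}
\tag{3.1}
$$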



We represent the matrix $P$ as

$$P = \begin{bmatrix} I & 0 \\ R & Q \end{bmatrix} \tag{3.2}$$

where $I$, $R$ and $Q$ correspond to the partitions made in (3.1).
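As a concrete illustration of (3.1) and (3.2), the following sketch builds $P$ for a single limb and extracts the blocks $R$ and $Q$. The function name `transition_matrix`, the list-based layout (row and column index = state number + 1) and the three-center example probabilities are assumptions made for this example only.

```python
def transition_matrix(p_reject, p_satisfy):
    """Build P over the states (-1, 0, 1, ..., L) of one limb.

    p_reject[i-1] and p_satisfy[i-1] hold p_{i,-1} and p_{i0} for center C_i.
    The referral probability p_{i,i-1} is taken as the remainder, which must
    vanish at the root C_1.
    """
    L = len(p_reject)
    P = [[0.0] * (L + 2) for _ in range(L + 2)]
    P[0][0] = 1.0                      # state -1 (rejected) is absorbing
    P[1][1] = 1.0                      # state 0 (satisfied) is absorbing
    for i in range(1, L + 1):
        row = i + 1                    # row index of state i
        refer = 1.0 - p_reject[i - 1] - p_satisfy[i - 1]
        P[row][0] = p_reject[i - 1]
        P[row][1] = p_satisfy[i - 1]
        if i == 1:
            if abs(refer) > 1e-12:
                raise ValueError("the root cannot refer a message further")
        else:
            P[row][row - 1] = refer    # column of state i-1
    return P


# Hypothetical limb with L = 3 centers (C_1 is the root).
P = transition_matrix([0.3, 0.1, 0.2], [0.7, 0.5, 0.3])
R = [row[:2] for row in P[2:]]         # transient -> absorbing block
Q = [row[2:] for row in P[2:]]         # transient -> transient block
for row in P:
    print(["%.2f" % x for x in row])
```

Only the subdiagonal of `Q` is non-zero, which is what makes the closed forms in the next section possible.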

Let $c_{ij}$ ($i = 1, 2, \ldots, L$; $j = -1, 0, 1, 2, \ldots, L$) be the cost associated with a transition from state $C_i$ to $C_j$ in one step. This may be the actual cost of message transfer or delay encountered in a transaction or a combination of the two. At this stage we shall assume the existence of such costs for the model. Subsequently we explore the possibilities of using response time as part of a cost function.

The cost $c_{ij}$ can be assumed either random or deterministic. When assumed random, we denote its first two moments by

$$E(c_{ij}) = \gamma_{ij} \tag{3.3}$$

and

$$E(c_{ij}^2) = \eta_{ij}. \tag{3.4}$$

Let $K$ be the total network cost in a given length of time. This is comprised of costs of messages originating at different centers. Let $m_i$ be the cost associated with a message originating at center $C_i$ before eventually it is either rejected or filled by one of the centers in the network. Whenever the message is referred to state $j$, certain costs are incurred based on the possible actions taken. These are

(1) rejection with cost $c_{j,-1}$,

(2) satisfaction with cost $c_{j0}$,

(3) referral with cost $c_{j,j-1}$.
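Let $m_{ij}$ be the cost associated with such a visit so that

$$m_i = \sum_{j=1}^{i} m_{ij}. \tag{3.5}$$

For $m_{ij}$ we have

$$
m_{ii} =
\begin{cases}
c_{i,-1} & \text{with probability } p_{i,-1} \\
c_{i0} & \text{with probability } p_{i0} \\
c_{i,i-1} & \text{with probability } p_{i,i-1}
\end{cases}
\tag{3.6}
$$

$$
m_{ij} =
\begin{cases}
0 & \text{with probability } 1 - p_{i,i-1} \\
m_{i-1,j} & \text{with probability } p_{i,i-1}
\end{cases}
\qquad j < i.
\tag{3.7}
$$

Let

$$\mu_{ij} = E(m_{ij}) \tag{3.8}$$

$$\sigma_{ij}^2 = V(m_{ij}) \tag{3.9}$$

for $i, j = 1, 2, \ldots, L$. The matrices with $\mu_{ij}$ and $\sigma_{ij}^2$ as elements are denoted by $M$ and $S$ respectively.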

In the following section we use relations (3.6) and (3.7) to derive expressions for $\mu_{ij}$ and $\sigma_{ij}^2$ ($i, j = 1, 2, \ldots, L$) and the total network cost $K$.

IV. Mean and Variance of Message Referral Cost 

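Taking expectations of (3.6) and (3.7) we get

$$\mu_{ii} = p_{i,-1}\gamma_{i,-1} + p_{i0}\gamma_{i0} + p_{i,i-1}\gamma_{i,i-1} \tag{4.1}$$

$$\mu_{ij} = p_{i,i-1}\,\mu_{i-1,j}, \qquad j < i. \tag{4.2}$$

Let

$$M_D = \begin{bmatrix} \mu_{11} & & & \\ & \mu_{22} & & \\ & & \ddots & \\ & & & \mu_{LL} \end{bmatrix}. \tag{4.3}$$

Now, expressing (4.1) and (4.2) in matrix notation we get

$$M = M_D + QM \tag{4.4}$$

or

$$M = (I-Q)^{-1} M_D \tag{4.5}$$

provided $(I-Q)^{-1}$ exists.

Clearly

$$
I - Q =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-p_{21} & 1 & 0 & \cdots & 0 & 0 \\
0 & -p_{32} & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -p_{L,L-1} & 1
\end{bmatrix}
\tag{4.6}
$$

is non-singular with inverse

$$
(I-Q)^{-1} =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
p_{21} & 1 & 0 & \cdots & 0 \\
p_{32}p_{21} & p_{32} & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
p_{L,L-1}p_{L-1,L-2}\cdots p_{21} & p_{L,L-1}\cdots p_{32} & \cdots & p_{L,L-1} & 1
\end{bmatrix}
\tag{4.7}
$$

thus giving an explicit expression for $M$.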

Let $n_i$ be the number of messages originating at center $C_i$ ($i = 1, 2, \ldots, L$) during a given period. Then, the expected value of the total network cost $K$ can be expressed as

$$E(K) = \sum_{i=1}^{L} n_i \sum_{j=1}^{L} \mu_{ij}. \tag{4.8}$$

If $n_i$ is a random variable, we get

$$E(K) = \sum_{i=1}^{L} E(n_i) \sum_{j=1}^{L} \mu_{ij}. \tag{4.9}$$
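The mean-cost computation can be checked numerically with a short sketch. It fills $M$ row by row from the recursions (4.1) and (4.2), which is equivalent to forming $(I-Q)^{-1}M_D$ for this lower-bidiagonal $Q$, and then evaluates (4.8). The function names, argument layout and all numerical values are hypothetical, and the referral probability is taken to be the remainder $1 - p_{i,-1} - p_{i0}$.

```python
def mean_cost_matrix(p_rej, p_sat, g_rej, g_sat, g_ref):
    """Matrix M of mean visit costs mu_ij for one limb (index 0 is the root C_1).

    p_rej[i], p_sat[i]         : p_{i,-1} and p_{i0}
    g_rej[i], g_sat[i], g_ref[i]: mean one-step costs gamma_{i,-1},
                                  gamma_{i0} and gamma_{i,i-1}
    """
    L = len(p_rej)
    p_ref = [1.0 - a - b for a, b in zip(p_rej, p_sat)]
    M = [[0.0] * L for _ in range(L)]
    for i in range(L):
        # (4.1): mean cost incurred at the originating center itself
        M[i][i] = p_rej[i] * g_rej[i] + p_sat[i] * g_sat[i] + p_ref[i] * g_ref[i]
        for j in range(i):
            # (4.2): cost at a lower center is incurred only after a referral
            M[i][j] = p_ref[i] * M[i - 1][j]
    return M


def expected_total_cost(M, n):
    """E(K) from (4.8); pass E(n_i) in `n` to obtain (4.9)."""
    return sum(n_i * sum(row) for n_i, row in zip(n, M))


# Hypothetical three-center limb.
M = mean_cost_matrix([0.3, 0.1, 0.2], [0.7, 0.5, 0.3],
                     [1, 1, 1], [5, 5, 5], [0, 2, 2])
EK = expected_total_cost(M, [10, 20, 30])
print(M)
print(EK)
```

The row sums of `M` are the expected costs per message by originating center, so the same matrix also shows how much of the cost of a leaf-originated message is incurred higher up the limb.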
To derive the corresponding variances of costs, we square both sides of equations (3.6) and (3.7), to get

$$
m_{ii}^2 =
\begin{cases}
c_{i,-1}^2 & \text{with probability } p_{i,-1} \\
c_{i0}^2 & \text{with probability } p_{i0} \\
c_{i,i-1}^2 & \text{with probability } p_{i,i-1}
\end{cases}
\tag{4.10}
$$

and when $j < i$,

$$
m_{ij}^2 =
\begin{cases}
0 & \text{with probability } 1 - p_{i,i-1} \\
m_{i-1,j}^2 & \text{with probability } p_{i,i-1}.
\end{cases}
\tag{4.11}
$$

Taking expectations we can write

$$E(m_{ii}^2) = p_{i,-1}\eta_{i,-1} + p_{i0}\eta_{i0} + p_{i,i-1}\eta_{i,i-1} \tag{4.12}$$

and

$$E(m_{ij}^2) = p_{i,i-1}\,E(m_{i-1,j}^2), \qquad j < i. \tag{4.13}$$

Let

$$\eta_i = p_{i,-1}\eta_{i,-1} + p_{i0}\eta_{i0} + p_{i,i-1}\eta_{i,i-1} \tag{4.14}$$

and

$$H = \begin{bmatrix} \eta_1 & & & \\ & \eta_2 & & \\ & & \ddots & \\ & & & \eta_L \end{bmatrix}. \tag{4.15}$$

We obtain as before

$$\| E(m_{ij}^2) \| = (I-Q)^{-1} H \tag{4.16}$$



where we have used $\| x_{ij} \|$ to denote the matrix with $x_{ij}$ as elements. Let

$$
M_2 =
\begin{bmatrix}
\mu_{11}^2 & \mu_{12}^2 & \cdots & \mu_{1L}^2 \\
\mu_{21}^2 & \mu_{22}^2 & \cdots & \mu_{2L}^2 \\
\vdots & \vdots & & \vdots \\
\mu_{L1}^2 & \mu_{L2}^2 & \cdots & \mu_{LL}^2
\end{bmatrix}.
\tag{4.17}
$$

Then we can write

$$S = (I-Q)^{-1} H - M_2. \tag{4.18}$$

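To derive the variance of the total cost $K$ we also need

$$E\Big[\Big(\sum_{j=1}^{i} m_{ij}\Big)^2\Big].$$

We have

$$m_i = \sum_{j=1}^{i} m_{ij}. \tag{4.19}$$

From (3.6) and (3.7) we get

$$
m_i =
\begin{cases}
c_{i,-1} & \text{with probability } p_{i,-1} \\
c_{i0} & \text{with probability } p_{i0} \\
c_{i,i-1} + m_{i-1} & \text{with probability } p_{i,i-1}.
\end{cases}
\tag{4.20}
$$

Squaring both sides of (4.20) and taking expectations we get

$$
E(m_i^2) = p_{i,-1}\eta_{i,-1} + p_{i0}\eta_{i0} + p_{i,i-1}\eta_{i,i-1}
+ p_{i,i-1}E(m_{i-1}^2) + 2p_{i,i-1}E(c_{i,i-1}m_{i-1}).
\tag{4.21}
$$

In the hierarchical network structure $c_{i,i-1}$ and $m_{i-1}$ can be assumed to be independent random variables; hence

$$E(c_{i,i-1}m_{i-1}) = \gamma_{i,i-1}\,\mu_{i-1} \tag{4.22}$$

where we have written $\mu_i = E(m_i)$.

Let

$$
T =
\begin{bmatrix}
0 & 0 & \cdots & 0 & 0 \\
p_{21}\gamma_{21} & 0 & \cdots & 0 & 0 \\
0 & p_{32}\gamma_{32} & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & p_{L,L-1}\gamma_{L,L-1} & 0
\end{bmatrix}.
\tag{4.23}
$$

In matrix notation (4.21) can then be written as

$$
\begin{bmatrix} E(m_1^2) \\ E(m_2^2) \\ \vdots \\ E(m_L^2) \end{bmatrix}
=
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_L \end{bmatrix}
+ Q \begin{bmatrix} E(m_1^2) \\ E(m_2^2) \\ \vdots \\ E(m_L^2) \end{bmatrix}
+ 2T \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_L \end{bmatrix}
\tag{4.24}
$$

with $\eta_i$ the second moment of the one-step cost at $C_i$ defined in (4.14), which on rearrangement gives

$$
\begin{bmatrix} E(m_1^2) \\ E(m_2^2) \\ \vdots \\ E(m_L^2) \end{bmatrix}
= (I-Q)^{-1}
\left\{
\begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_L \end{bmatrix}
+ 2T \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_L \end{bmatrix}
\right\}.
\tag{4.25}
$$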
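The second-moment recursion can be evaluated directly, center by center from the root upward, without forming the matrices. The sketch below follows (4.20) to (4.22); the function name `cost_moments`, the tuple layout of the moments and the numerical values are hypothetical, the referral probability is taken as the remainder $1 - p_{i,-1} - p_{i0}$, and the one-step costs are deterministic in the example so that each second moment is simply the square of the cost.

```python
def cost_moments(p_rej, p_sat, gamma, eta):
    """Mean, second moment and variance of m_i for each center of a limb.

    gamma[i] and eta[i] are (reject, satisfy, refer) triples holding the
    first and second moments of the one-step costs at center C_{i+1};
    index 0 is the root, which never refers.
    """
    mu, m2 = [], []
    for i in range(len(p_rej)):
        p_ref = 1.0 - p_rej[i] - p_sat[i]
        probs = (p_rej[i], p_sat[i], p_ref)
        mean = sum(p * g for p, g in zip(probs, gamma[i]))
        second = sum(p * e for p, e in zip(probs, eta[i]))
        if i > 0:
            # after a referral the message also incurs m_{i-1}
            mean += p_ref * mu[i - 1]
            # (4.21) with the independence assumption (4.22)
            second += p_ref * m2[i - 1] + 2.0 * p_ref * gamma[i][2] * mu[i - 1]
        mu.append(mean)
        m2.append(second)
    var = [s - m * m for s, m in zip(m2, mu)]
    return mu, m2, var


# Hypothetical limb: reject costs 1, satisfy costs 5, referral costs 2.
gamma = [(1, 5, 0), (1, 5, 2), (1, 5, 2)]
eta = [(1, 25, 0), (1, 25, 4), (1, 25, 4)]
mu, m2, var = cost_moments([0.3, 0.1, 0.2], [0.7, 0.5, 0.3], gamma, eta)
print(mu, m2, var)
```

For the second center the result can be verified by enumeration: the cost is 1 with probability 0.1, 5 with probability 0.5, 3 with probability 0.12 and 7 with probability 0.28, which gives the same mean and second moment.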



Let $n_i$ be the number of messages originating at center $C_i$, as assumed before. We also assume that $E(n_i)$ and $E(n_i^2)$ exist and are known. Let

$$m_i^{(1)}, m_i^{(2)}, \ldots, m_i^{(n_i)}$$

be the costs associated with these $n_i$ messages. Total cost of all the messages originating at center $C_i$ is given by

$$K_i = m_i^{(1)} + m_i^{(2)} + \cdots + m_i^{(n_i)}.$$

We assume that costs $m_i^{(r)}$ ($r = 1, 2, \ldots, n_i$) are independent of the number of messages $n_i$. The variance of this sum is given by

$$V\Big(\sum_{r=1}^{n_i} m_i^{(r)}\Big) = V(n_i)\,[E(m_i)]^2 + E(n_i)\,V(m_i). \tag{4.26}$$

When $n_i$ are deterministic, we have

$$V\Big(\sum_{r=1}^{n_i} m_i^{(r)}\Big) = n_i\,V(m_i). \tag{4.27}$$
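A small numerical sketch of (4.26) and (4.27) follows. The function name and the values are hypothetical; the identity is used here under the usual additional assumption that the individual message costs are mutually independent with a common mean and variance.

```python
def origin_cost_variance(mean_n, var_n, mean_m, var_m):
    """Variance of the total cost of messages originating at one center.

    Implements (4.26): V(n) * E(m)^2 + E(n) * V(m).
    With var_n = 0 (a deterministic message count) it reduces to (4.27).
    """
    return var_n * mean_m ** 2 + mean_n * var_m


# Hypothetical values: cost per message with mean 4.92 and variance 3.1936.
# A Poisson-distributed message count with mean 20 also has variance 20.
v_random = origin_cost_variance(20, 20, 4.92, 3.1936)
v_fixed = origin_cost_variance(20, 0, 4.92, 3.1936)
print(v_random, v_fixed)
```

In this example most of the variance comes from the uncertainty in the number of messages rather than from the cost of an individual message.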
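For the total cost $K$, we have

$$
\begin{aligned}
V(K) &= V\Big(\sum_{r=1}^{n_1} m_1^{(r)} + \sum_{r=1}^{n_2} m_2^{(r)} + \cdots + \sum_{r=1}^{n_L} m_L^{(r)}\Big) \\
&= \sum_{i=1}^{L} V\Big(\sum_{r=1}^{n_i} m_i^{(r)}\Big)
+ 2\sum_{i<j}\sum \operatorname{cov}\Big(\sum_{r=1}^{n_i} m_i^{(r)},\; \sum_{k=1}^{n_j} m_j^{(k)}\Big).
\end{aligned}
\tag{4.28}
$$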



When $n_i$ are random the covariance term in (4.28) is very involved. When they are constant, however,



n , n . 

'^(r) 






where

$$
q_{i\ell} =
\begin{cases}
p_{i,i-1}\,p_{i-1,i-2}\cdots p_{\ell+1,\ell}, & \ell < i \\
0 & \text{otherwise.}
\end{cases}
\tag{4.29}
$$



Thus we get 



V(K) = I niVCm.) + 2ll ^q.^q <n -y 2^^ 



i<j “ ^£=1 



(4.30) 



V. Use of Model for Evaluation and Design

Introduction of a mathematical model is only of academic interest unless its usefulness in solving real world problems is explained. In this section we discuss the requirements for applying the model developed in the preceding two sections.

The Markovian model can be used in two different situations: (1) evaluation of existing networks and (2) development of design criteria for new networks.

The decision to apply the Markovian model to an existing network introduces the problem of parameter estimation. The basic parameters in our model are the probability elements of the matrix $P$ in (3.1). Maximum likelihood estimates of these probabilities are given by the fraction of messages referred from one center to another [6]. That is, if $n_{ij}$ is the number of messages observed to pass from state $C_i$ to state $C_j$, then

$$\hat{p}_{ij} = \frac{n_{ij}}{\sum_{k} n_{ik}}. \tag{5.1}$$
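A minimal sketch of this estimation step is given below, assuming that each center keeps a tally of how many of the messages it handled were rejected, satisfied or referred. The function name, the tuple layout and the counts are hypothetical.

```python
def estimate_probabilities(counts):
    """Observed-fraction estimates of (p_{i,-1}, p_{i0}, p_{i,i-1}).

    counts[i] is a (rejected, satisfied, referred) tally for center C_{i+1}.
    """
    estimates = []
    for rejected, satisfied, referred in counts:
        total = rejected + satisfied + referred
        if total == 0:
            raise ValueError("no messages observed at this center")
        estimates.append((rejected / total, satisfied / total, referred / total))
    return estimates


# Hypothetical tallies for a two-center limb (the root refers nothing).
p_hat = estimate_probabilities([(30, 70, 0), (10, 50, 40)])
print(p_hat)
```

Each estimated row sums to one by construction, so the estimates can be inserted directly into the rows of $P$.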



For an operating network, one can derive means and variances of costs incurred by the message referral process. No theoretical forms for distributions need be assumed; empirical distributions can be used. Also different forms of cost functions can be used to describe the situation. Once these parameters have been estimated, the expected message cost can be obtained using formulas developed in section IV and comparisons can be made between two or more existing networks regarding their cost and effectiveness.

The extensive data collection necessary for model construction is provided by the statistical reporting systems currently being implemented on digital computers. An example is the TALON Medical Library Network's TRIPS System [7].

Applying the Markovian model to network design poses several problems related to parameter estimation, some of which can be handled through standard techniques. Sufficient data may exist for the estimation of parameters:

(1) relating to message arrivals,

(2) describing the information resources,

(3) providing the message referral probabilities,

(4) indicating alternative message transfer modes, and

(5) furnishing costs involved throughout the network.



One estimate that is not easily obtained from data is the response time for network requests. We must use a mathematical model to estimate response time as a function of message arrival and service rates at individual centers of the network.

The mathematical model proposed for this purpose is a simple model from queueing theory [8, 9]. At each center the arriving messages are analogous to customers in a queueing system and the processing of messages to the service function. With this similarity in mind, we make the following assumptions.



(1) At center $C_i$ users initiate requests for information at the rate $\lambda_{1i}$ per unit time.

(2) Center $C_i$ also receives requests from center $C_{i+1}$ at the rate of $\lambda_{i+1}p_{i+1,i}$ per unit time. Let

$$\lambda_i = \lambda_{1i} + \lambda_{i+1}\,p_{i+1,i}. \tag{5.2}$$

Thus $\lambda_i$ is the combined arrival rate at center $C_i$.

(3) The combined arrival process has the characteristics of a Poisson process with parameter $\lambda_i$; i.e., if $A(t)$ is the number of messages arriving during an interval of length $t$, then

$$\Pr\{A(t) = n\} = e^{-\lambda_i t}\,\frac{(\lambda_i t)^n}{n!} \qquad (n = 0, 1, 2, \ldots). \tag{5.3}$$

(4) The message processing times at center $C_i$ are independent and identically distributed random variables with first moment $b_i$ and second moment $b_i^{(2)}$; these times are independent of the message arrival process and the number of messages waiting to be processed.

Let $\rho_i$ be the utilization factor for center $C_i$ defined by

$$\rho_i = \frac{\text{Rate of arrival of messages}}{\text{Rate of processing}} = \lambda_i b_i < 1. \tag{5.4}$$

Let $R_i$ be the expected response time at center $C_i$, which is defined as the time interval from the arrival of a message until its disposition at that center.

Based on the assumptions above, we can identify the message processing at a center with the operation of a single server queueing system with similar characteristics. Then the response time is given by [equation (1.196) of [9]]

$$R_i = b_i + \frac{\lambda_i\, b_i^{(2)}}{2(1 - \rho_i)}. \tag{5.5}$$



In (5.5) estimates of parameters can be obtained through standard techniques; therefore, estimates of $R_i$ ($i = 1, 2, \ldots, L$) can be derived. These estimates can be used in the determination of the expected costs $\gamma_{ij}$ of (3.3) either directly or in conjunction with other factors such as message transfer costs.
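The response-time estimate can be sketched numerically. The code below assumes that the single-server result cited from [9] is the standard Pollaczek-Khinchine mean value formula for a queue with Poisson arrivals and general service times, and that the rate of messages referred down the limb is the upstream arrival rate multiplied by the referral probability. The function names, argument layout and numerical values are hypothetical.

```python
def combined_arrival_rates(user_rates, p_refer):
    """Combined arrival rate at each center of a limb (index 0 is the root).

    user_rates[i] : rate of requests initiated by users at center C_{i+1}
    p_refer[i]    : probability that C_{i+1} refers a message to C_i
                    (p_refer[0] is unused because the root refers nothing)
    """
    L = len(user_rates)
    lam = [0.0] * L
    for i in range(L - 1, -1, -1):            # start at the leaf
        inflow = lam[i + 1] * p_refer[i + 1] if i + 1 < L else 0.0
        lam[i] = user_rates[i] + inflow
    return lam


def expected_response_time(arrival_rate, service_mean, service_second_moment):
    """Mean time from arrival to disposition at a single-server center."""
    rho = arrival_rate * service_mean          # utilization factor
    if rho >= 1.0:
        raise ValueError("utilization must be below 1 for a stable queue")
    waiting = arrival_rate * service_second_moment / (2.0 * (1.0 - rho))
    return service_mean + waiting


lam = combined_arrival_rates([1.0, 2.0, 4.0], [0.0, 0.4, 0.5])
# Exponential processing times with mean 0.5 have second moment 2 * 0.5**2.
R = expected_response_time(1.0, 0.5, 0.5)
print(lam, R)
```

With exponential processing times the result agrees with the familiar single-server value 1/(service rate - arrival rate), which serves as a quick sanity check on the formula.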

VI. Summary and Discussion

Taking into account the uncertainties associated with the decisions made at each center of a strictly hierarchical network, a probabilistic model is developed for the network cost. Expressions are given for the mean and variance of the cost, and methods are suggested for the estimation of model parameters required for application.

The strictly hierarchical network is very restrictive as evidenced by the flexibility measure developed in [1] as well as the transition probability matrix presented in (3.1). Even though the discussion in section II justifies the use of the strictly hierarchical structure in many practical situations, many types of network operations exist which do not belong to this class [1, section III]. In such cases the transition probabilities of the message referral schemes are usually non-stationary; thus further research is needed for their analysis.

In section II, a sequential order of centers for message referrals is presented assuming that the referral cost $c_{ij}$ between centers $C_i$ and $C_j$ is independent of the centers through which the message is referred. A more realistic approach is to consider the probabilities of satisfying the request at intermediate centers.

The response time estimate is derived in (5.5) under very restrictive assumptions. In assumption (3) of section V, the combined arrival process is assumed to have the characteristics of a Poisson process. If the user requests at each center follow a Poisson process and the processing times at each center can be represented by independent and identically distributed random variables with the negative exponential density function

$$f(x) = \alpha e^{-\alpha x}, \qquad x > 0, \tag{6.1}$$

for some rate $\alpha > 0$, then, in the long run, the Poisson assumption is justified. Note that we have not assumed any specific form for the processing time distribution in assumption (4). This reflects our belief that the mixture of different types of arrivals at a center may justify the Poisson assumption even when the processing time is not exponentially distributed. This contention needs to be tested through data collected from network operation.

In assumption (4) of section V a single set of first and second moments has been assumed for processing times at a center irrespective of the nature of message disposition. Use of different sets of parameters for different types of messages should present no problems in the extension of this model.

In the above discussion we emphasize that the present investigation solves only some aspects of the general problem. We believe that further research on the remaining aspects should provide a strong theoretical base in the analysis and design of information networks.